Revealing Disease Similarities by Text Mining
نویسندگان
چکیده
Texts written in human language contain structured information that is not easily parsable by computers. Text mining relies on large text corpora to derive rules which can be used by automatic means to extract automatically such information. Scientific literature represents the main source of information to study any biological phenomenon. While some phenomenon are studied to the point that corpora can actually be build, scientific literature describing rare diseases is scarce implying an even bigger challenge for automatic approaches. In order to tackle this problem the ELIXIR infrastructure is supporting various initiatives for data integration in different field of life sciences, including rare diseases, which will pave the way to the development of dedicated pieces of software. In this work we present a tool which applies a text-mining strategy to multiple text sets and merges individual results in order to infer not explicitly written connections.
منابع مشابه
خوشهبندی اسناد مبتنی بر آنتولوژی و رویکرد فازی
Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...
متن کاملAn Incremental Algorithm to find Asymmetric Word Similarities for Fuzzy Text Mining
Synonymy – different words with the same meaning – is a major problem for text mining systems. We have proposed asymmetric word similarities as a possible solution to this problem, where the similarity between words is computed on the basis of the similarities between contexts in which the words appear, rather than on their syntactic identity. In this paper, we give details of an incremental al...
متن کاملارائه مدلی برای استخراج اطلاعات از مستندات متنی، مبتنی بر متنکاوی در حوزه یادگیری الکترونیکی
As computer networks become the backbones of science and economy, enormous quantities documents become available. So, for extracting useful information from textual data, text mining techniques have been used. Text Mining has become an important research area that discoveries unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...
متن کاملA generalized Framework of Privacy Preservation in Distributed Data mining for Unstructured Data Environment
The management of unstructured data is recognized as one of the major unsolved problems in the information industry and data mining paradigm. Unstructured data in computerized information that either does not have a data model and there are not easily usable by data mining. This paper proposes a solution to this problem by managing unstructured data in to structured data using legacy system and...
متن کاملMining at Detail Level Using Conceptual Graphs *
Text mining is defined as knowledge discovery in large text collections. It detects interesting patterns such as clusters, associations, deviations, similarities, and differences in sets of texts. Current text mining methods use simplistic representations of text contents, such as keyword vectors, which imply serious limitations on the kind and meaningfulness of possible discoveries. We show ho...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017